Abstract: Hepatitis C virus (HCV) infection is one of the most common and hazardous viral infections among injecting drug users (IDUs) globally, and several risk factors associated with it have been identified. In data mining, various algorithms are used to extract useful information from large data sets, and they also play a vital role in the prediction of numerous diseases. C4.5 is one of the most popular algorithms for rule-based classification and pruning of decision trees. A decision tree is a supervised classification technique that is simple, fast and accurate for prediction and decision making, and it can be applied to any domain. In this paper, the C4.5 algorithm is used to rank the risk factors associated with HCV among IDUs in India. It also identifies the most relevant attributes in the dataset, so that the input space is reduced and performance improves at the same time. In addition, it yields a decision tree for effective decision making. The experimental results from the C4.5 algorithm and the resulting decision tree show the most important factors associated with HCV infection among IDUs in India.
Keywords: Data Mining, C4.5 Algorithm, IDU, HCV, Entropy, Decision tree
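As a minimal sketch of how an entropy-based ranking of attributes can work, the Python below computes the gain ratio (the split criterion commonly associated with C4.5) for categorical attributes and sorts them by it. The attribute names, the `TOY_ROWS` records and the helper functions are hypothetical and serve only as illustration; they are not the study's dataset, risk factors or implementation.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def gain_ratio(rows, attr, target):
    """Information gain of splitting on `attr`, divided by the split information."""
    base = entropy([r[target] for r in rows])
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    n = len(rows)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy([r[attr] for r in rows])
    gain = base - remainder
    # An attribute with a single value carries no split information.
    return gain / split_info if split_info > 0 else 0.0

def rank_attributes(rows, attrs, target):
    """Return the attributes sorted from highest to lowest gain ratio."""
    return sorted(attrs, key=lambda a: gain_ratio(rows, a, target), reverse=True)

# Hypothetical toy records, invented purely for illustration.
TOY_ROWS = [
    {"shared_needles": "yes", "age_group": "young", "hcv": "pos"},
    {"shared_needles": "yes", "age_group": "old", "hcv": "pos"},
    {"shared_needles": "no", "age_group": "young", "hcv": "neg"},
    {"shared_needles": "no", "age_group": "old", "hcv": "neg"},
]

if __name__ == "__main__":
    print(rank_attributes(TOY_ROWS, ["age_group", "shared_needles"], "hcv"))
```

In this toy example the attribute that separates the classes perfectly is ranked first, and an attribute that is independent of the class receives a gain ratio of zero. A full C4.5 implementation would additionally handle continuous attributes, missing values and pruning, none of which is shown here.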